An Algorithm for Voiced / Unvoiced Decision and Pitch Estimation in Speech Feature Extraction

نویسنده

  • WANG Dong
چکیده

An algorithm which combines voice / unvoiced decision and pitch estimation is proposed in an enhanced process of MFCC feature extraction. The residual energy of LPC analysis and normalized autocorrelation are calculated and the static and dynamic thresholds are set for the voiced, unvoiced and transitional decision. Thus speech is divided into three classes that are voiced, unvoiced and transitional. Then the pitch is estimated by a dynamic programming (DP) algorithm. In the following harmonic peak picking process, the result is refined by the additional spectral information. The algorithm is empowered by the finite state machine (FSM) embedded in U/V decision which can convert the static thresholds to dynamical variable thresholds and represent the actual speech more exactly. Experiments also show that performance gains of word recognition rate from 71.49% to 74.42% in the National 863 standard Mandarin speech Corpus.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improve the Implementation of Pitch Features for Mandarin Digit String Recognition Task

Mandarin digit string recognition (MDSR) is a difficult task in the field of automatic speech recognition (ASR) and using pitch feature can significantly improve the performance. In conventional methods of pitch feature extraction, random value is commonly used as pitch output in unvoiced (UV) frames, which causes serious statistical confusion between voiced (V) and UV units and incurs abnormal...

متن کامل

Your Wavelet Based Pitch Detection and Voiced/unvoiced Decision

This paper describes property of the sudden change of a speech signal on its Glottal Closure Instant (GCI) and thereby discusses the principle of the localization of wavelets in both time and frequency domains. Based on this discussion, an algorithm for voiced/unvoiced segment decision and pitch detection is presented.

متن کامل

A novel approach of low bit-rate speech coding based on sinusoidal representation and auditory model

In this paper, a new auditory spectrum based speech feature is proposed using sinusoidal representation and auditory model. The feature is optimized using the properties of auditory perception and masking. After quantizing and encoding the optimized feature parameters, a new speech-coding algorithm with average bit-rate of 3.25kbps is developed. The experimental results show that the synthetic ...

متن کامل

An algorithm for multi-pitch tracking in co-channel speech

Most multi-pitch algorithms are tested for performance only in voiced regions of speech, and are prone to yield pitch estimates even when the participating speakers are unvoiced. This paper presents a multi-pitch algorithm that detects the voiced and unvoiced regions in a mixture of two speakers, identifies the number of speakers in voiced regions, and yields the pitch estimates of each speaker...

متن کامل

Decomposition of Speech into Voiced and Unvoiced Components Based on a Kalman Filterbank

We present a novel method for decomposing speech into signals representing the voiced and unvoiced components of speech. The method involves first demodulating the variations in spectral envelope, energy and pitch, and then applying a bank of Kalman filters to separate the harmonic and non-harmonic components of the signal. The use of Kalman filters relies on a state-space representation of the...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002